Fine-grained categorization via CNN-based automatic extraction and integration of object-level and part-level features

نویسندگان

  • Ting Sun
  • Lin Sun
  • Dit-Yan Yeung
چکیده

Fine-grained categorization can benefit from part-based features which reveal subtle visual differences between object categories. Handcrafted features have been widely used for part detection and classification. Although a recent trend seeks to learn such features automatically using powerful deep learning models such as convolutional neural networks (CNN), their training and possibly also testing require manually provided annotations which are costly to obtain. To relax these requirements, we assume in this study a general problem setting in which the raw images are only provided with objectlevel class labels for model training with no other side information needed. Specifically, by extracting and interpreting the hierarchical hidden layer features learned by a CNN, we propose an elaborate CNN-based system for fine-grained categorization. When evaluated on the Caltech-UCSD Birds-200-2011, FGVC-Aircraft, Cars and Stanford dogs datasets under the setting that only object-level class labels are used for training and no other annotations are available for both training and testing, our method achieves impressive performance that is superior or comparable to the state of the art. Moreover, it sheds some light on ingenious use of the hierarchical features learned by CNN which has wide applicability well beyond the current fine-grained categorization task.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Real Time Fine-Grained Categorization with Accuracy and Interpretability

A well-designed fine-grained categorization system usually has three contradictory requirements: accuracy (the ability to identify objects among subordinate categories); interpretability (the ability to provide human-understandable explanation of recognition system behavior); and efficiency (the speed of the system). To handle the trade-off between accuracy and interpretability, we propose a no...

متن کامل

Efficient Two-Step Middle-Level Part Feature Extraction for Fine-Grained Visual Categorization

Fine-grained visual categorization (FGVC) has drawn increasing attention as an emerging research field in recent years. In contrast to generic-domain visual recognition, FGVC is characterized by high intraclass and subtle inter-class variations. To distinguish conceptually and visually similar categories, highly discriminative visual features must be extracted. Moreover, FGVC has highly special...

متن کامل

Developing a New Method in Object Based Classification to Updating Large Scale Maps with Emphasis on Building Feature

According to the cities expansion, updating urban maps for urban planning is important and its effectiveness is depend on the information extraction / change detection accuracy. Information extraction methods are divided into two groups, including Pixel-Based (PB) and Object-Based (OB). OB analysis has overcome the limitations of PB analysis (producing salt-pepper results and features with hole...

متن کامل

Integration of Visible Image and LIDAR Altimetric Data for Semi-Automatic Detection and Measuring the Boundari of Features

This paper presents a new method for detecting the features using LiDAR data and visible images. The proposed features detection algorithm has the lowest dependency on region and the type of sensor used for imaging, and about any input LiDAR and image data, including visible bands (red, green and blue) with high spatial resolution, identify features with acceptable accuracy. In the proposed app...

متن کامل

Object-Oriented Method for Automatic Extraction of Road from High Resolution Satellite Images

As the information carried in a high spatial resolution image is not represented by single pixels but by meaningful image objects, which include the association of multiple pixels and their mutual relations, the object based method has become one of the most commonly used strategies for the processing of high resolution imagery. This processing comprises two fundamental and critical steps towar...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Image Vision Comput.

دوره 64  شماره 

صفحات  -

تاریخ انتشار 2017